What Does Your Benchmark Really Measure? A Framework for Robust Inference of AI Capabilities

Jo, Nathanael, Wilson, Ashia

arXiv.org Artificial Intelligence

Evaluations of generative models on benchmark data are now ubiquitous, and their outcomes critically shape public and scientific expectations of AI's capabilities. Yet growing skepticism surrounds their reliability. How can we know that a reported accuracy genuinely reflects a model's true performance? Evaluations are often presented as simple measurements, but in reality they are inferences: to treat benchmark scores as evidence of capability is already to assume a theory of what capability is and how it manifests in a test. We make this step explicit by proposing a principled framework for evaluation as inference: begin from a theory of capability, and then derive methods for estimating it. This perspective, familiar in fields such as psychometrics, has not yet become commonplace in AI evaluation. As a proof of concept, we address a central challenge that undermines reliability: sensitivity to perturbations. After formulating a model of ability, we introduce methods that infer ability while accounting for uncertainty from sensitivity and finite samples, including an adaptive algorithm that significantly reduces sample complexity. Together, these contributions lay the groundwork for more reliable and trustworthy estimates of AI capabilities as measured through benchmarks.
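The abstract does not give implementation details, but one way to picture the kind of inference it describes is the sketch below; the function name, the perturbation setup, and the use of a bootstrap are illustrative assumptions, not the authors' method. Ability is taken to be the mean per-item accuracy across perturbed variants of each benchmark item, and a bootstrap interval reflects both perturbation sensitivity and finite-sample uncertainty.

    import numpy as np

    def estimate_ability(outcomes, n_boot=2000, alpha=0.05, seed=0):
        """Hypothetical sketch: outcomes[i][j] is 1 if the model answered
        perturbation j of benchmark item i correctly, else 0. Ability is
        the mean per-item accuracy; the bootstrap resamples items to
        reflect finite-sample and perturbation-sensitivity uncertainty."""
        rng = np.random.default_rng(seed)
        per_item = np.array([np.mean(o) for o in outcomes])  # accuracy per item over its perturbations
        point = per_item.mean()
        boots = [rng.choice(per_item, size=len(per_item), replace=True).mean()
                 for _ in range(n_boot)]
        lo, hi = np.quantile(boots, [alpha / 2, 1 - alpha / 2])
        return point, (lo, hi)

    # Example: three items, each scored under a handful of paraphrase perturbations.
    outcomes = [[1, 1, 0], [0, 0, 1, 0], [1, 1, 1]]
    print(estimate_ability(outcomes))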


A novel interface for adversarial trivia question-writing

Liu, Jason

arXiv.org Artificial Intelligence

A critical component when developing question-answering AIs is an adversarial dataset that challenges models to adapt to the complex syntax and reasoning underlying our natural language. Present techniques for procedurally generating adversarial texts are not robust enough for training on complex tasks such as answering multi-sentence trivia questions. We instead turn to human-generated data by introducing an interface for collecting adversarial human-written trivia questions. Our interface is aimed towards question writers and players of Quiz Bowl, a buzzer-based trivia competition where paragraph-long questions consist of a sequence of clues of decreasing difficulty. To incentivize usage, a suite of machine learning-based tools in our interface assist humans in writing questions that are more challenging to answer for Quiz Bowl players and computers alike. Not only does our interface gather training data for the groundbreaking Quiz Bowl AI project QANTA, but it is also a proof-of-concept of future adversarial data collection for question-answering systems. The results of performance-testing our interface with ten originally-composed questions indicate that, despite some flaws, our interface's novel question-writing features as well as its real-time exposure of useful responses from our machine models could facilitate and enhance the collection of adversarial questions. The code for our interface is available at: https://github.com/Zefan-Cai/QAML


The Complete Collection of Data Science Interviews – Part 1 - KDnuggets

#artificialintelligence

Have you ever been in a situation where the interviewer asked you a situational or technical question and you froze, simply because you were not prepared for it? It happens to many people, including me. I tend to freeze during technical interviews, and hiring managers take it as a weakness and reject me at the initial stage of the recruitment process. To overcome this problem, I started looking at sample interview questions.


British doctors go on the defensive due to 'high-performing' 'GP at Hand' app

The Japan Times

LONDON – A medical chatbot said to perform as well as or even better than human doctors has sparked a war of words in Britain over how much the cash-strapped public health service should rely on artificial intelligence. AI company Babylon, which is already working with the National Health Service, claimed its chatbot scored higher marks than real, live doctors in "robust tests." The British firm said it quizzed the AI using sample questions from trainee exams set by Britain's Royal College of General Practitioners (RCGP), the professional body for family doctors. The chatbot, a key feature of Babylon's "GP at Hand" app, scored 81 percent when sitting the test for the first time, while the average pass mark for doctors over the past five years was 72 percent, according to the company. Ali Parsa, the company's founder, who presented the findings in London earlier this week, hailed the results as "a landmark." "(They) take humanity a significant step closer to achieving a world where no one is denied safe and accurate health advice," he said in a statement.


ConferenceCall 2017 04 05 - OntologPSMW

#artificialintelligence

A fast-performing, dictionary-based tagger [1] constitutes EXTRACT's core. The tagger relies on a set of dictionaries that map biological names to corresponding terms in biological ontologies, or to pertinent records in public biological databases.
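The excerpt does not include code, but a minimal sketch of a dictionary-based tagger of this general kind might look as follows; the dictionary entries, identifiers, and function name below are invented for illustration and are not EXTRACT's actual data or interface.

    import re

    # Hypothetical dictionary: surface names mapped to ontology/database identifiers.
    DICTIONARY = {
        "p53": "HypotheticalGeneDB:TP53",
        "escherichia coli": "NCBITaxon:562",
        "glucose": "CHEBI:17234",
    }

    def tag(text, dictionary=DICTIONARY):
        """Return (matched name, identifier, start, end) for every dictionary hit,
        using case-insensitive whole-word matching."""
        hits = []
        for name, ident in dictionary.items():
            pattern = r"\b" + re.escape(name) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                hits.append((m.group(0), ident, m.start(), m.end()))
        return sorted(hits, key=lambda h: h[2])

    print(tag("Expression of p53 was measured in Escherichia coli grown on glucose."))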


Could YOU pass the secretive Oxford entrance exam? University reveals some of its most common questions - and how to answer them

Daily Mail - Science & tech

It's a question you might never have considered before – why do older siblings do better on IQ tests than their younger counterparts? But if you want to get into Oxford's experimental psychology program, you'd better be prepared to answer. The university has released a series of questions from tutors who conduct the infamous interviews, revealing the complex problems in everything from mathematics to medicine used to spot the sharpest candidates. Oxford has revealed five interview questions spanning Modern Languages, Medicine, Philosophy, Maths, and Experimental Psychology.